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A Numerical Method to solve Optimal Transport 
Problems with Coulomb Cost 

Jean-David Benamou, Guillaume Carlier, and Luca Nenna 


Abstract In this paper, we present a numerical method, based on iterative Bregman 
projections, to solve the optimal transport problem with Coulomb cost. This is re¬ 
lated to the strong interaction limit of Density Functional Theory. The first idea is to 
introduce an entropic regularization of the Kantorovich formulation of the Optimal 
Transport problem. The regularized problem then corresponds to the projection of a 
vector on the intersection of the constraints with respect to the Kullback-Leibler dis¬ 
tance. Iterative Bregman projections on each marginal constraint are explicit which 
enables us to approximate the optimal transport plan. We validate the numerical 
method against analytical test cases. 


1 Introduction 

1.1 On Density functional theory 

Quantum mechanics for a molecule with N electrons boils down to the many- 
electron Schrodinger equation for a wave function \j/ G (in this paper, 

we neglect the spin variable). The limit of this approach is computational : in or¬ 
der to predict the chemical behaviour of H 2 O (10 electrons) using a 10 gridpoints 
discretization of M, we need to solve the Schrodinger equation on 10^** gridpoints. 
This is why Hohenberg, Kohn and Sham introduced, in [19] and [21], the Density 
Functional Theory (DFT) as an approximate computational method for solving the 
Schrodinger equation at a more reasonable cost. 
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The main idea of the DFT is to compute only the marginal density for one elec¬ 
tron 


p(xi) = J YNixi,X2--- ,XN)dX2---dXN, 


where Jn = \ 'Wix\,-- • ,xn)\^ is the joint probability density of electrons at positions 
xi, • • • jXat € instead of the full wave function y/. One scenario of interest for the 
DFT is when the repulsion between the electrons largely dominates over the kinetic 
energy. In this case, the problem can, at least formally, be reformulated as an Op¬ 
timal Transport (OT) problem as emphasized in the pioneering works of Buttazzo, 
De Pascale and Gori-Giorgi [6] and Cotar, Friesecke and Kliippelberg [10]. 


1.2 Optimal Transport 


Before discussing the link between DFT and OT, let us recall the standard optimal 
transport problem and its extension to the multi-marginal framework. Given two 
probability distributions p and v (on W^, say) and a transport cost c: x —> 
M, the optimal transport problem consists in finding the cheapest way to transport 
p to V for the cost c. A transport map between p and v is a Borel map T such 
that T#p = V i.e. v(A) = p{T^'-{A)) for every Borel subset A of The Monge 
problem (which dates back to 1781 when Monge [24] posed the problem of finding 
the optimal way to move a pile of dirt to a hole of the same volume) then reads 


min / c{x,T{x))p{dx). (1) 

T#l^=v 


This is a delicate problem since the mass conservation constraint T#p = v is highly 
nonlinear (and the feasible set may even be empty for instance if /r is a Dirac mass 
and V is not). This is why, in 1942, Kantorovich [20] proposed a relaxed formulation 
of (1) which allows mass splitting 


min 

ren{fi,v) 




c(x,y)y{dx,dy) 


( 2 ) 


where y & n{p,v) consists of all probability measures on x having p and v 
as marginals, that is: 


7 (A X K) = /r (a), VA Borel subset of 
y{RxB) = v{B), VB Borel subset of 


(3) 

(4) 


Note that this is a linear programming problem and that there exists solutions under 
very mild assumptions (e.g. c continuous and p and v compactly supported). A 
minimizing y in (2) is called an optimal transport plan and it gives the probability 
that a mass element in x be transported in y. Let us remark that if T is a transport 
map then it induces a transport plan yT{x,y) '■= p{x)d{y — T{x)) so if an optimal 
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plan of ( 2 ) has the form yf (which means that no splitting of mass occurs and 7 is 
concentrated on the graph of T) then T is actually an optimal transport map i.e. a 
solution to (1). The linear problem (2) also has a convenient dual formulation 

max / u{x)a{dx)+ / v{y)v(dy) (5) 

u,v\u(x)+v(y)<c(x,y) jRd jRd 

where u{x) and v( 7 ) are the so-called Kantorovich potentials. OT theory for two 
marginals has developed very rapidly in the 25 last years, there are well known 
conditions on c, and v which guarantee that there is a unique optimal plan which 
is in fact induced by a map (e.g. c=\x—y\^ and fx absolutely continuous, see Brenier 
[4]) and we refer to the textbooks of Villani [34, 35] for a detailed exposition. 

Let us now consider the so-called multi-marginal problems i.e. OT problems in¬ 
volving N marginals fli, • • • ,jJ.N and a cost c: —?■ M, which leads to the following 

generalization of ( 2 ) 


min / c{xi,-■ ■ ,XN)Y{dxi,-• • ,dxN) ( 6 ) 

where IT(/Xi ,■■■ ,/rA^) is the set of probability measures on having fli, • • • ,/X/v 

as marginals. The corresponding Monge problem then becomes 

min / c(xi,T 2 (xi),--- ,Tjv(xi))jUi(dxi). (7) 

Such multi-marginals problems first appeared in the work of Gangbo and Switch 
[16] who solved the quadratic cost case and proved the existence of Monge solu¬ 
tions. In recent years, there has been a lot of interest in such multi-marginal prob¬ 
lems because they arise naturally in many different settings such as economics [7], 
[29], polar factorization of vector fields and theory of monotone maps [17] and, of 
course, DFT [ 6 , 10, 8 , 14, 23, 11], as is recalled below. Few results are known about 
the structure of optimal plans for (7) apart from the general results of Brendan Pass 
[28], in particular the case of repulsive costs such as the Coulomb’s cost from DFT 
is an open problem. 

The paper is structured as follows. In Section 2, we recall the link between Den¬ 
sity Functional Theory and Optimal Transportation and we present some analytical 
solutions of the OT problem (e.g. optimal maps for radially symmetric marginals, 
for 2 electrons). In Section 3, we introduce a numerical method, based on itera¬ 
tive Bregman projections, and an algorithm which aims at refining the mesh where 
the transport plan is concentrated. In section 4 we present some numerical results. 
Section 5 concludes. 
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2 From Density Functional Theory to Optimal Transportation 


2.1 Optimal Transportation with Coulomb cost 


In Density Functional Theory [19] the ground state energy of a system (with N elec¬ 
trons) is obtained by minimizing the following functional w.r.t. the electron density 

p{x): 

E[p] = minF hkIp] + / Vext{x)p{x)dr (8) 

peM J 


where ^ = {p : —>• Mjp > 0,^/p € //* p(x)(ix = A^}, 

Z 

Vext '■= -r is the electron-nuclei potential (Z and R are the charge and the 

\x-R\ 

position of the nucleus, respectively) and Fhk is the so-called Hohenberg-Kohn 
which is defined by minimizing over all wave functions \j/ which yield p: 


Fhk[p] = min h^T[\j/]+Vee[v] (9) 

where is a semiclassical constant factor, 

T[v] = ll---lLli\^xM^dxi---dxN 
is the kinetic energy and 

Vee = I ■■■ / I Wl^dxi ■■■dXN 

\Xi X j \ 

is the Coulomb repulsive energy operator. 

Let us now consider the Semiclassical limit 

lims^omin^^p h^T[\if]+Vee[w] 

and assume that taking the minimum over \j/ commutes with passing to the limit 
A —> 0 (Cotar, Friesecke and Kluppelberg in [10] proved it for N = 2), we obtain the 
following functional 

= mn /••• f EE 1^ Iwl^dxv-dxN (10) 

where is the minimal Coulomb repulsive energy whose minimizer character¬ 
izes the state of Strictly Correlated Flectrons(SCE). 

Problem (10) gives rise to a multi-marginal optimal transport problem as (6) by 
considering that 

• according to the indistinguishability of electrons, all the marginals are equal to 

P, 
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• the cost function is given the electron-electron Coulomb repulsion, 

C(X1,-.,XA,) ^ (11) 

!=U>( I-*' 

• we refer to 7 a^ = | (//(xi, • • • ,XA^) p (which is the joint probability density of elec¬ 
trons at positions xi, • • • ,xjv € as the transport plan. 

The Coulomb cost function (11) is different from the costs usually considered in 
OT as it is not bounded at the origin and it decreases with distance. So it requires a 
generalized formal framework, but this is beyond the purpose of this work (see [6] 
and [10]). Finally (10) can be re-formulated as a Kantorovich problem 

Vg^^[p]= min / c(xi,-■ ■ ,xn)Yn(xi,-■ ■ ,XN)dxi ■ ■-dxN (12) 

'ri(yiv)=P,!=l,---,NjR3'V 

where 

Yn{xi , * * * , * * * , dXi—l , dXi^l 1 1 dxi\[ 

is the /—th marginal. As mentioned in section 1.2 if the optimal transport plan Yn 
has the following form 


7Af(xi,--- ,Xiv) =p(xi)5(x2-/J(xi))---5(xAr-/]^(xi)) (13) 

then the functions /]*: are the optimal transport maps (or co-motion func- 

tions) of the Monge problem 


N N 


s-t- fi#p = p, i = 2,...,N, /i(x)=x. 


p{x)dx 


(14) 


Remark 1. (Physical meaning of the co-motion function) /] (x) determine the posi¬ 
tion of the /-th electron in terms of x which is the position of the “1 st”electron ; 

defines a system with the maximum possible correlation between the relative 
electronic positions. 

In full generality, problem (14) is delicate and proving the existence of the co¬ 
motion functions is difficult. However, the co-motion functions can be obtained via 
semianalytic formulations for spherically symmetric atoms and strictly ID systems 
(see [10], [33], [22], [8]) and we will give some examples in the following section. 

Problem (12) admits a useful dual formulation in which the so called Kantorovich 
potential u plays a central role 

=tXlSLX{N J u{x)p{x)dx S.t. ^m(x,) < c(xi,...,XAr)}. (15) 
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Because c is invariant by permutation, there is a single dual Kantorovich potential 
for all all marginal constraints. Moreover, this potential u(x) is related to the co¬ 
motion functions via the classical equilibrium equation (see [33]) 


Vm(x) 


y x-fi{x) 

ti k-Zi'WP' 


(16) 


Remark 2. (Physical meaning of (16)) The gradient of the Kantorovich potential 
equals the total net force exerted on the electron in x by electrons in /2 (x), • • • , /n (x) . 


2.2 Analytical Examples 


2.2.1 The case N = 2 and d=\ 


In order to better understand the problem we have formulated in the previous sec¬ 
tion, we recall some analytical examples (see [6] for the details). 

Let us consider 2 particles in one dimension and marginals 


P\{x)=P2{x) 


a if\x\ < fl/2 
0 otherwise. 


(17) 


After a few computations, we obtain the following associated co-motion function 


If we take 



(18) 


we get 


P\{x)=p 2 {x) 


a — \x\ 


defined in [—a, a], 


(19) 


fix) 



[—a,a] 


Figure 1 shows the co-motion functions for (17) and (19). 


( 20 ) 


2.2.2 The case N >2 and d =l 

In [8], the authors proved the existence of optimal transport maps for problem (14) 
in dimension d = 1 and provided an explicit construction of the optimal maps. Let 
p be the normalized electron density and —oo = xq < xi < ■ ■ ■ < xn = +°° be such 
that 
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Fig. 1 Right: Co-motion function for (17) with a = 2. Left: Co-motion function for (19) with a = 1. 


p {x)dx =\/Nyi = 0,--- ,N-\. 

Thus, there exists a unique increasing function /: M —> M on each interval 
such that for every test-function (p one has 


/ y{f(x))p{x)dx= 1 

J[Xi,Xi+l\ J 

^ (p{x)p{x)dx Vi = 0, •• 

[^Ci+lA+l] 

■,N-2, (21) 

/ (p{f{x))p{x)dx = / (p{x)p{x)dx, 

The optimal maps are then given by 

(22) 

f2{x) = 

fix) 

(23) 

fi{x) = 

f^\x) yi = 2,---,N, 

(24) 


where stands for the i—th composition of /2 with itself Here, we present an 
example given in [6]. We consider the case where p is the Lebesgue measure on the 
unit interval I = [0,1], the construction above gives the following optimal co-motion 
functions 


x-l-1/3 if x<2/3 
X —2/3 if x>2/3 


fsix) = fiifiix)) 


x-l-2/3 if X < 1/3 
X— 1/3 if X > 1/3 


(25) 


Furthermore, we know that the Kantorovich potential u satisfies the relation (here 
we take N = 3) 


u' (x) 


y X-fi{x) 

ti 


and by substituting the co-motion functions in (26) (and integrating it) we get 


(26) 
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(fx 0<x<l/3 

mW=<T 1/3<x<2/3 

[-fx+f 2/3<x<l 

Figure 2 illustrates this example. 


(27) 


When N > 4 similar arguments as above can be developed and we can similarly 
compute the co-motion functions and the Kantorovich potential. 



Fig. 2 Right: co-motion function f 2 for (25). Center: co-motion function fi for (25). Left: Kan¬ 
torovich Potential u{x) (27). 


2.2.3 The radially symmetric marginal case for N = 2, d>2 

We discuss now the radial t/—dimensional (d > 2) case for N = 2. We assume that 
the marginal p is radially symmetric, then we recall the following theorem from 
[ 10 ]: 

Theorem 1. [10] Suppose that p(x) = p(|x|), then the optimal transport map is 
given by 

f*{x) = ^g{\x\), xeK'', (28) 

\x\ 

withg{r) = —F 2 ^{Fi{r)), Fi{t) :=C(d) jQp{s)s‘^^'^ds, ^2(0 '■=C{d) p{s)s^^^ds 
where C{d) denotes the measure ofS‘^^^, the unit sphere in 

Example 1. (Spherical coordinates system) If p is radially symmetric p(x} = p(|x|), 
it is convenient to work in spherical coordinates and then to set for every radius r>0 

A(r)=C(dy-'p(r) (29) 

so that for every test-function (p we have 


J^^(p{x)p{\x\)dx = ^(p{r,CO) )A(r)t/r 
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with C{d) the measure of S‘‘ * and a the d—\ measure on S‘^ ^ which in particular 
implies that X := |.|#p i.e. 


(p{\x\)p{\x\)dx 



(p{r)X{r)dr, € Cc(M+). 


(30) 


The radial part of the optimal co-motion function a{r) = —g(r) can be computed by 
solving the ordinary differential equation 

a'(r)X(a(r)) = X(r) 


which gives 




(31) 


We define R(r) = Jq X(s)ds, since r n- /?(r) is increasing, its inverse R * (w) is well 
defined for w C [0,1). Thus, we see that a(r) has the form 


a(r) =R-‘(2-R(r)). 


(32) 


2.2.4 Reducing the dimension under radial symmetry 

In the case where the marginal p(x) = p(|x|) is radially symmetric, the multi¬ 
marginal problem with Coulomb cost 

inf / c(xi,---,XN)dy(xi,---, xn) (33) 

yen(p,---,p)jRdiv 

with c the Coulomb cost given by (11) involves plans on which is very costly to 
discretize. Fortunately, thanks to the symmetries of the problem, it can actually be 
solved by considering a multi-marginal problem only on . Let us indeed define 
for every (ri, • • • , G ( 0 , 

c(ri,--- ,rN) :=inf{c(xu--- ,xiv) : \x\\= r\,-■ ■ ,\xn\= r^). (34) 

Defining X by (29) (or equivalently (30)) and defining n{X,--- ,X) as the set of 
probability measures on having each marginal equal to X, consider 

~ inf / c(ri,---,rA,)<i 7 (ri,---,riv). (35) 

We claim that inf(33) = inf(35). The inequality inf(33) > inf(35) is easy: take 7 G 
n{p,--- ,p) and define its radial component 7 by 

[ ,rN)dY{ri,--- ,rN):= [ F'dxi , |xAr|)t/7(xi ,• • • ,XAr), VF G Cc(M+), 

(36) 
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it is obvious that 7 G iT(A, • • • ,X) and since c(xi, • • • ,xn) > c(|xi|,• • • , |xA^|), the 
inequality inf(33) > inf(35) easily follows. To show the converse inequality, we use 
duality. Indeed, by standard convex duality, we have 

inf(33) = sup-^A^ / u{x)p{x)dx : Yu{xi)<c{x\,---,XN)> (37) 

u 1 JwL‘‘ J 

and similarly 

inf(35) = sup|a^ / v(r)X(r)dr : J^v(n) < c(ri,-■ ■ ,rN)l. (38) 

V 1 Jr+ J 

Now since p is radially symmetric and the constraint of (37) is invariant by chang¬ 
ing u hy uoR with R a rotation (see (11)) , there is no loss of generality in re¬ 
stricting the maximization in (37) to potentials of the form m(x,) = w(r,), but then 
the constraint of (37) implies that w satisfies the constraint of (38). Then we have 
inf(33) = sup(37) < sup(38) = inf(35). Note then that 7 G fT(p, • • • ,p) solves (33) 
if and only if its radial component 7 solves (33) and c{x \, • • • ,xn) = c(|xi |, • • • , |xa^|) 
7 -a.e. Therefore (33) gives the optimal radial component, whereas the extra con¬ 
dition c{x\,--- ,xn) = c(|xi|,--- ,|xjv|) 7 -a.e. gives an information on the angular 
distribution of 7 . 


3 Iterative Bregman Projections 

Numerics for multi-marginal problems have so far not been extensively developed. 
Discretizing the multi-marginal problem leads to the linear program (41) where the 
number of constraints grows exponentially in N, the number of marginals. In this 
section, we present a numerical method which is not based on linear programming 
techniques, but on an entropic regularization and the so-called alternate projection 
method. It has recently been applied to various optimal transport problems in [12] 
and [2]. 

The initial idea goes back to von Neumann [26], [25] who proved that the se¬ 
quence obtained by projecting orthogonally iteratively onto two affine subspaces 
converges to the projection of the initial point onto the intersection of these affine 
subspaces. Since the seminal work of Bregman [3], it is by now well-known that 
one can extend this idea not only to several affine subspaces (the extension to con¬ 
vex sets is due to Dyskstra but we won’t use it in the sequel) but also by replacing 
the euclidean distance by a general Bregman divergence associated to some suit¬ 
able strictly and differentiable convex function / (possibly with a domain) where 
we recall that the Bregman divergence associated with / is given by 


Df{x,y)=f{x)-f{y)-{Vf{y),x-y). 


( 39 ) 
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In what follows, we shall only consider the Bregman divergence (also known 
as the Kullback-Leibler distance) associated to the Boltzmann/Shannon entropy 
f{x) := LiT,(logx, — 1) for non-negative x,. This Bregman divergence (restricted 
to probabilities i.e. imposing the normalization = 1) is the Kullback-Leibler 
distance or relative entropy: 


Df{x,y)=Y^Xi\og(-\ 

Bregman distances are used in many other applications most notably image pro¬ 
cessing, see [18] for instance. 


3.1 The Discrete Problem and its Entropic Regularization 


In this section we introduce the discrete problem solved using the iterative Bregman 
projections [3]. From now on, we consider the problem (12) 

min / c{x\,-■ ■ ,xn)Yn{xi,-■ ■ ,XN)dx\-■-dxN^ (40) 

yNe‘^J(W)N 


where N is the number of marginals (or electrons), c{xi, ...jXn) is the Coulomb cost, 
Yn the transport plan, is the probability distribution over and ^ := 

with % := {Yn & KiYn = p} (we remind the reader that electrons are 

indistinguishable so the N marginals coincide with p). 

In order to discretize (40), we use a discretisation with points of the support 
of the kth electron density as If the densities p are approximated by 




(41) 


where = cixj ^, • • • ,Xjf^) and the transport plan support for each coordinate 

is restricted to the points {xjj^^k = 1 ) • ’ ’ thus becoming a matrix again 

denoted 7 with elements Yj[,---JN- "^he marginal constraints % becomes 


^.:={ 7 eMW~| Y. = (42) 


Recall that the electrons are indistinguishable, meaning that they have same densi¬ 
ties : 

The discrete optimal transport problem (41) is a linear program problem and is 
dual to the discretization of (15) 
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M 

max ^ NujPj 

,/=l 

N 

s.t. ^ Uj. < Cj^ V;,' = 1, • • • 
1=1 


(43) 


where Uj = Uj. = u{xj.). Thus the primal (41) has (M^)^ unknown and x N lin¬ 
ear constraints and the dual (43) only unknown but still [M^)^ constraints. They 
are computationally not solvable with standard linear programming methods even 
for small cases in the multi-marginal case. 


A different approach consists in computing the problem (41) regularized by the 
entropy of the joint coupling. This regularization dates to E. Schrodinger [32] and 
it has been recently introduced in machine learning [12] and economics [15] (we 
refer the reader to [ 2 ] for an overview of the entropic regularization and the iterative 
Bregman projections in OT). Thus, we consider the following discrete regularized 
problem 




(44) 


where £( 7 ) is defined as follows 


£( 7 ) 


lii. -iiv 7/i. -.;ivlog(7;,. -,A) if 7> 0 

+00 otherwise. 


(45) 


After elementary computations, we can re-write the problem as 


min££( 7 | 7 ) 


(46) 


y. 

where ££( 717 ) = Lii7'i,...,w(log( )) is the Kullback-Leibler distance and 

7,.....<jv =e e . (47) 

As explained in section 1.2, when the transport plan 7 is concentrated on the 
graph of a transport map which solves the Monge problem, after discretisation of 
the densities, this property is lost along but we still expect the 7 matrix to be sparse. 
The entropic regularization will spread the support and this helps to stabilize the 
computation; it defines a strongly convex program with a unique solution "f which 
can be obtained through elementary operations (we detail this in section 3.3 for both 
the continuous and discrete framework). The regularized solutions "f then converge 
to 7 *, the solution of (41) with minimal entropy, as e —> 0 (see [9] for a detailed 
asymptotic analysis and the proof of exponential convergence). Let us now apply 
the iterative Bregman projections to find the minimizer of (46). 
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3.2 Alternate Projections 


The main idea of the iterative Bregman projections (we call it Bregman as the 
Kullback-Leibler distance is also called Bregman distance, see [3]) is to construct a 
sequence Y‘ (which converges to the minimizer of (46)) by alternately projecting on 
each set % with respect to the Kullback-Leibler distance. Thus, the iterative KL (or 
Bregman) projections can be written 

if =f ( 48 ) 

Ir =p§^ir-') vn>o 

where we have extended the indexing of the set by periodicity such that '^„+Ar = 

'rfn Vn C N and denotes the KL projection on 


The convergence of y" to the unique solution of (46) is well known, it actually 
holds for large classes of Bregman distances and in particular the Kullback-Leibler 
divergence as was proved by Bauschke and Lewis [1] 

y" —> as n —)• oo. 

Remarks. If the convex sets % are not affine sub-sets (that is not our case), 
converges toward a point of the intersection which is not the KL projection of f 
anymore so that a correction term is needed as provided by Dykstra’s algorithm (we 
refer the reader to [2]). 

The KL projection on / = 1,can be computed explicitly as detailed in the 
following proposition 

Proposition 1. For y G the projection P§^{y) is given by 

P%i.7)n,-,h = Pj. y -- yj, ^ ( 4 ^^ 

Proof. Introducing Lagrange multipliers Xj. associated to the constraint 

E yjw-M = Ph (50) 


the KL projection is given by the optimality condition : 

iog(f^)-4 = 0 


so that 




X 

where Cj. = e . If we substitute (52) in (50), we get 


(51) 


( 52 ) 
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which gives (49). 


^ji — Pji 




(53) 


3.3 From the Alternate Projections to the Iterative Proportional 
Fitting Procedure 


The alternate projection procedure (48) is performed on Mj matrices. Moreover 
each projection (49) involves computing partial sum of this matrix. The total opera¬ 
tion cost of each Bregman iteration scales like 


In order to reduce the cost of the problem, we use an equivalent formulation 
of the Bregman algorithm known as the Iterative Proportional Fitting Procedure 
(IPFP). Let us consider the problem (46) in a continous measure setting and, for 
simplicity, 2 -marginals framework 

r , . ^ (54) 

{y\^i(y)=pMy)=p}J «7 

where p, p and 7 are nonnegative measures. The aim of the IPFP is to find the KL 
projection of 7 on n(p,p) (see (47) for the definition of y which depends on the 
cost function). 

Under the assumption that the value of (54) is finite, Riischendorf and Thomsen 
(see [31]) proved that a unique KL-projection y* exists and that it is of the form 

y*{x,y) =a{x)b{y)y{x,y), a{x)>0, b{y)>0. (55) 


From now on, we consider (with a sligthly abuse of notation) Borel measures with 
densities 7 , y, p and p w.r.t. the suitable Lebesgue measure, a and b can be uniquely 
determined by the marginal condition as follows 


a{x) 

b{y) 


P{x) 

jy{x,y)b{y)dy’ 

pjy) 

f y(x,y)a(x)dx 


Then, IPFP is defined by the following recursion 

bo = l, ao = p, 

piy) 


bn+liy) = 
an+-iix) = 


f y(x,y)a„(x)dx’ 
P(x) 

fy(x,y)b„+i(y)dy' 


(56) 


( 57 ) 
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Moreover, we can define the sequence of joint densities (and of the corresponding 
measures) 

7 ^"(x,y) := a’'{x)b''{y)y{x,y) 7 ^"+* := a''{x)b''+\y)y{x,y), n>0. (58) 


Ruschendorf proved (see [30]) that y” converges to the KL-projection of 7 . We can, 
now, recast the IPFP in a discrete framework, which reads as 


7 ; — OibjYij, bj — 1, Oj — Pi, 


(59) 


^n+l _ Pj 

^ Ltyjar 

fl"+' = 


X,. .hfl+l ' 


i!=anjb) 


By definition of notice that 


y2fi 1 yZn 


and if (61) is re-written as follows 


yin 


^n +1 

bj 


= Pi 
= Pj 


'Lk fikbl 

yij< 

I^kfkjol 


(60) 

( 61 ) 


(62) 


then we obtain 



= Pi 


= Pj 


1 

bj 



,2n ' 


Lktk] 


(63) 


Thus, we exactly recover the Bregman algorithm described in the previous section, 
for 2 marginals. 


The extension to the multi-marginal framework is straightforward but cumber- 
sone to write. It leads to a problem set on N M^-dimensional vectors , j = 
1, • • • ,N, /(.) = 1, • • • ,Md. Each update takes the form 


^n+l 

i-ij 


Pij 


u, 




fh. 


■,‘N 


^n +1 


,n+l 

' 2.<2 


3«+l 

V 1 . 6 - 


(64) 


1 


Where each 4 takes values in {1, • • • ,Md}. 
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Note that we still need a constant cost matrix 7 . Thanks to the symmetry and 
separability properties of the cost function (see (11) and (47)), it is possible to re¬ 
place it by aN (N — 1 )/2 product of matrices. This is already a big improvement 
from the storage point of view. Further simplifications are under investigations but 
the brute force IPFP operational cost therefore scales like 0{N which pro¬ 
vides a small improvement over the Bregman iterates option. 


3.4 A heuristic refinement mesh strategy 

We will use a heuristic refinement mesh strategy allowing to obtain more accuracy 
without increasing the computational cost and memory requirements. This idea was 
introduced in [27] for the adaptative resolution of the pure Linear Programming 
formulation of the Optimal Transportation problem, i.e without the entropic regu- 
larisation. 

If the optimal transport plan is supported by a lower dimensional set, we ex¬ 
pect the entropic regularisation to be concentrated on a mollified version of this set. 
Its width should decrease with the entropic parameter e if the discretisation is fine 
enough. Working with a fixed e, the idea is to apply coarse to fine progressive reso¬ 
lution and work with a sparse matrix 7 . At each level, values below a threshold are 
filtered out (set to 0 ), then new positive values are interpolated on a finer grid (next 
level) where 7 is strictly positive. 

To simplify the exposition, we describe the algorithm for 2—marginals in ID and 
take a a/M gridpoints discretization of / = [a^b] G M: 

1. we start with a cartesian M gridpoints mesh on / x / to approximate transport 
plan "f, obtained by running the IPFP on a coarse grid. 

2 . we take mc{j) = maxijfj and mr{i) = maxj)fj which are the maximum values 
over the rows and over the columns respectively, and we define 

m = mm[mmj{mc{j)) ,mini{mr{i))]. 

We will refine the grid only inside the level curve "f = ^m where we expect the 
finer solution is supported, see figure 3. 

3. In order to keep approximately the same number of element in the sparse ma¬ 
trix 7 at each level we refine the grid as follows : Let := {{i,i)\yfj 

and := ‘^3^ and r \ — /M, then the size of the grid at the next level is 

=Mlr. 

4. We compute the interpolation Tw"™ of the old transport plan Ym on the finer grid. 

5. Elements of 7 m''<’“' below the fixed threshold <^m are filtered out, i.e are fixed to 0 
and are not used in the IPFP sum computations, see figure 3. 
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6 . Finally, a new IPFP computation is performed and it can be initialised with an 
interpolation of the data at the previous level (7 can be easly re-computed on the 
gridpoints where is strictly positive). 



Fig. 3 Left: SL is the set of grid points inside the level curve y — (^ = 0.91 (the bold line 

curve). Center: The new grid after the refinement. Right: The transport Plan after a new IPFP 
computation 


4 Numerical Results 

4.1 N = 2 electrons: comparison between numerical and analytical 
results 

In order to validate the numerical method, we now compare some numerical results 
for 2 electrons in dimension d = 1, • • • ,3 with the analytical results from section 
2.2. Let us first consider a uniform density (as (17) with a = 2) in ID. In table 1, we 
analyze the performance of the numerical method by varying the parameter e. We 
notice that the error becomes smaller by decreasing the regularizing parameter, but 
the drawback is that the method needs more iterations to converge. Figure 4 shows 
the Kantorovich potential, the co-motion function which can be recovered from the 
potential by using (16) and the transport plan. The simulation is performed with a 
discretization of (17) with a = 2, M = 1000 (gridpoints) and e = 0.004. 

As explained in section 2.2.3, we can also compute the co-motion for a radially 
symmetric density. We have tested the method in 2D and 3D, figure 5 and 6 respec¬ 
tively, by using the normalized uniform density on the unit ball. Moreover, in the 
radial case we have proved that the OT problem can be reduced to a 1 —dimensional 
problem by computing c which is trivial for the 2 electrons case: let us set the prob¬ 
lem in 2D in polar coordinates (ri,0i) and (^ 2 , 62)5 for the first and the second 
electron respectively (without loss of generality we can set 9i = 0 ), then it is easy to 
verify that the minimum is achieved with 62 = 7t. Figure 5 shows the Kantorovich 
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potential (the radial component v(r) as defined in section 2.2.4), the co-motion and 
the transport plan for the 2—dimensional case, the simulation is performed with 
M = 1000 and e = 0.002. In figure 6 we present the result for th 3—dimensional 
case, the simulation is performed with M — 1000 and e = 0.002. 

Remark 4. One can notice that, in the case of a uniform density, the transport plan 
presents a concentration of mass on the boundaries. This is a combined effect of the 
regularization and of the fact that the density has a compact support. 


£ 

Error ( — «||„o/ |m |=o) 

Iteration 

0.256 

0.1529 

11 

0.128 

0.0984 

16 

0.064 

0.0578 

25 

0.032 

0.0313 

38 

0.016 

0.0151 

66 

0.008 

0.0049 

114 

0.004 

0.0045 

192 


Table 1 Numerical results for uniform density in ID. is the numerical Kantorovich potential 
and u is the analytical one. 



Fig. 4 Top-Left: Kantorovich Potential u(x). Top-Right: Numerical co-motion function (solid line) 
and analytical co-motion (star-solid line). Bottom-Left: Transport plan f. Bottom-Right: Support 

off- 
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Fig. 5 Top-Left: Kantorovich Potential v{r). Top-Right: Numerical co-motion function (solid line) 
and analytical co-motion (star-solid line) . Bottom-Left: Transport plan f. Bottom-Right: Support 

ofr 


4.2 N = 2 electrons in dimension d = 3 : Helium atom 

Once we have validated the method with some analytical examples, we solve the 
regularized problem for the Helium atom by using the electron density computed 
in [13]. In figure 7, we present the electron density, the Kantorovich potential and 
the transport plan. The simulation is performed with a discretization of [0,4] with 
M = 1000 and e = 5 10^^. We can notice the potential correctly fits the asymptotic 

N- 1 

behaviour from [33], namely v(r) ~ ——— for r —> oo, where N is the number of 

kl 

electrons. 


4.3 N = 3 electrons in dimension d = I 

We present now some results for the 1 —dimensional multi-marginal problem with 
N = 3. They are validated against the analytical solutions given in section 2.2.2. We 
recall that splitting p into three tertiles p, with equal mass, we will have pi —> p 2 . 
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Fig. 6 Top-Left: Kantorovich Potential v{r). Top-Right: Numerical co-motion function (solid line) 
and analytical co-motion (star-solid line) . Bottom-Left: Transport plan f. Bottom-Right: Support 

ofr 


P 2 P 3 and P 3 ^Pi- 


In table 2, we present the perfomance of the method for a uniform density on 
[ 0 , 1 ] by varying e and, as expected, we see the same behaviour as in the 2 marginals 
case. Figure 8 shows the Kantorovich potential and the projection of the transport 
plan onto two marginals (namely 7 ^ = The support gives the relative po¬ 

sitions of two electrons. 

The simulation is performed on a discretization of [0,1] with a uniform density, 
M = 1000 and e = 0.02. If we focus on the support of the projected transport plan 
we can notice that the numerical solution correctly reproduces the prescribed be¬ 
havior The concentration of mass is again due to the compact support of the density, 
which is not the case of the gaussian as one can see in figure 9. In figure 9 we 
present the numerical results for p(x) = /v^- The simulation is performed on 

the discretization of [—2.5,2.5] with M = 1000 and e = 0.008. 
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Fig. 7 Top-Left: Helium density X{r) = 4xr^p(r). Top-Right: Kantorovich Potential v{r) (blue) 
and asymptotic behaviour (red) v(r) ^ * r —> <». Bottom-Left: Transport plan y. Bottom-Right: 
Support of y. All quantities are in Hartree atomic units. 


e 

Error (\\u‘^ — « |„o/ |m =o) 

Iteration 


0.0658 

121 


0.0373 

230 

0.08 

0.0198 

446 

0.04 

0.0097 

878 

0.02 

0.0040 

1714 


Table 2 Numerical results for uniform density in ID and three electrons. is the numerical 
Kantorovich potential and u is the analytical one. 


4.4 N = 3 electrons in dimension d = 3 radial case : Litium atom 

We finally perform some simulations for the radial 3—dimensional case for N = 3. 
As for the 3—dimensional case with 2 marginals we solve the reduced problem; let 
us consider the spherical coordinates (r,, 0,, 0,) with i = 1, • • • ,3 and we fix 0i = 0 
and = 02 = 0 (the first electrons defines the z axis and the second one is on the 
xz plane). We then notice that 03 = 0 as the electrons must be on the same plane 
of the nucleus to achieve compensation of forces (one can see it by computing the 
optimality conditions), so we have to minimize on 02 and 03 in order to obtain c. 
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Fig. 8 Left: Numerical Kantorovich potential u{x) (solid line) and analytical potential (star-solid 
line). Center: Projection of the transport plan 7ti2{y{x,y,z)). Rigth: Support of Ki 2 {j{x,y,z)) The 
dot-dashed lines delimit the intervals where pi, with i = 1, • • • , 3, are defined. 



Fig. 9 Left: Kantorovich potential u(x). Center: Projection of the tramport plan n\ 2 {'Y(x,y,z)). 
Rigth: Support of Ki 2 {Y{x,y,z)). The dot-dashed lines delimit the intervals where p,, with i = 
1, • • • ,3, are defined. 


Figure 10 shows the electron density of the Litium (computed in [5]), the Kan¬ 
torovich Potential (and the asymptotic behavior) and the projection of the transport 
plan onto two marginals The support gives the relative positions of 

two electrons. 

The simulation is performed on a discretization of [0,8] with M = 300 and e = 
0.007. Our results show (taking into account the regularization effect) a concentrated 
transport plan for this kind of density and they match analogous result obtained in 
[33]. If we focus on the support of the transport plan we can notice that the optimal 
solution forces the electrons to occupy three different regions as conjectured in [33]. 


5 conclusion 

We have presented a numerical scheme for solving multi-marginal OT problems 
arising from DFT. This is a challenging problems, not only because of the unusual 
features of the Coulomb cost which is singular and repulsive but also due to the high 
dimension of the space of plans. 

Using an entropic regularization gives rise to a Kullback-Leibler projection prob¬ 
lem onto the intersection of affine subsets given by the marginal constraints. Because 
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Fig. 10 Top-Left: Litium density Xir) = 4;rr^p(r). Top-Right: Kantorovich Potential v{r) (blue) 
and asymptotic behaviour (red) v(r) ^ ^ r —)■ oo. Bottom-Left: Projection of the Transport plan 
Bottom-Right: Support of the projected transport plan The dot-dashed lines de¬ 

limit the three regions that the electrons must occupy, we computed them numerically following the 
idea in [33],All quantities are in Hartree atomic units. 


each projection is explicit, one can use Bregman’s iterative projection algorithm to 
approximate the solution. 

The power of such an iterative projection approach was recently emphasized in 
[12, 2] for the entropic regularization of optimal transport problems, we showed that 
is also well suited to treat the multi-marginal OT problem with Coulomb cost and 
leads to the same benefits in terms of convexification of the problem and simplicity 
of implemention. 

The method presented here is just a preliminary step which is simple to imple¬ 
ment and therefore easy to use in practice. The cost of solving the general DFT 
problem in dimension 3 for a large number of electrons is still unfeasible and we 
need to use radial symmetry simplification and also a heuristic refinement mesh 
strategy. 

A lot of questions are left for future research : can IPFP be used for sharper 
approximations for DFT? Can one justify rigorously and quantitatively the mesh re¬ 
finement strategy? How should the regularization parameter e be chosen in practice? 
Does the entropic regularization have a physical interpretation? 
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